Minimally Supervised Classification to Semantic Categories using Automatically Acquired Symmetric Patterns

Authors

  • Roy Schwartz
  • Roi Reichart
  • Ari Rappoport
Abstract

Classifying nouns into semantic categories (e.g., animals, food) is an important line of research in both cognitive science and natural language processing. We present a minimally supervised model for noun classification, which uses symmetric patterns (e.g., “X and Y”) and an iterative variant of the k-Nearest Neighbors algorithm. Unlike most previous work, we do not use a predefined set of symmetric patterns, but extract them automatically from plain text, in an unsupervised manner. We experiment with four semantic categories and show that symmetric patterns constitute much better classification features than leading word embedding methods. We further demonstrate that our simple k-Nearest Neighbors algorithm outperforms two state-of-the-art label propagation alternatives for this task. In experiments, our model obtains 82%-94% accuracy using as few as four labeled examples per category, emphasizing the effectiveness of simple search and representation techniques for this task.
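
The abstract does not spell out the algorithm, so the following is only a minimal illustrative sketch of the general idea, not the authors' implementation. It assumes that word pairs matching a symmetric pattern such as “X and Y” have already been extracted, builds a weighted co-occurrence graph from them, and repeatedly labels unlabeled words by a weighted vote among their k strongest already-labeled neighbors. The names `build_graph` and `iterative_knn`, the vote weighting, and the tie-breaking rule are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def build_graph(pairs):
    """Build an undirected weighted graph from (x, y) word pairs.

    Each pair is assumed to be one match of a symmetric pattern
    such as "X and Y"; the edge weight counts the matches.
    """
    graph = defaultdict(Counter)
    for x, y in pairs:
        if x != y:
            graph[x][y] += 1
            graph[y][x] += 1
    return graph


def iterative_knn(graph, seeds, k=3):
    """Propagate seed labels with a simple iterative k-NN vote.

    seeds maps a few words to their category. In every round, each
    unlabeled word looks at its k strongest neighbors that were
    labeled in earlier rounds and takes the category with the largest
    total edge weight. Rounds repeat until no new word gets a label.
    """
    labels = dict(seeds)
    while True:
        new_labels = {}
        for word in list(graph):
            if word in labels:
                continue
            # Neighbors labeled so far, strongest edge first
            # (ties broken alphabetically for determinism).
            labeled = [(w, n) for n, w in graph[word].items() if n in labels]
            labeled.sort(key=lambda t: (-t[0], t[1]))
            top = labeled[:k]
            if not top:
                continue
            votes = Counter()
            for weight, neighbor in top:
                votes[labels[neighbor]] += weight
            new_labels[word] = max(sorted(votes), key=lambda c: votes[c])
        if not new_labels:
            return labels
        labels.update(new_labels)


if __name__ == "__main__":
    # Toy pattern matches, invented for illustration only.
    pairs = [("cat", "dog"), ("cat", "dog"), ("dog", "horse"),
             ("horse", "cow"), ("apple", "bread"), ("bread", "cheese")]
    result = iterative_knn(build_graph(pairs), {"cat": "animal", "apple": "food"})
    print(result)
```

In this toy run the two seed words are enough to label the remaining words, because each round makes newly labeled words available as neighbors for the next one. Words with no path to a seed stay unlabeled.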

Similar resources

WikiSense: Supersense Tagging of Wikipedia Named Entities Based WordNet

In this paper, we introduce a minimally supervised method for learning to classify named-entity titles in a given encyclopedia into broad semantic categories in an existing ontology. Our main idea involves using overlapping entries in the encyclopedia and ontology and a small set of 30 hand-tagged parenthetical explanations to automatically generate the training data. The proposed method involv...

Dynamic categorization of clinical research eligibility criteria by hierarchical clustering

OBJECTIVE To semi-automatically induce semantic categories of eligibility criteria from text and to automatically classify eligibility criteria based on their semantic similarity. DESIGN The UMLS semantic types and a set of previously developed semantic preference rules were utilized to create an unambiguous semantic feature representation to induce eligibility criteria categories through hie...

Automatic Acquisition of Artifact Nouns in French

This article describes a method for automatically acquiring artifact nouns in French by extracting predicate-argument structures. Two strategies are presented: the supervised strategy and the semi-supervised strategy. In the supervised method, the semantic classes of artifact nouns are recognized by identifying the predicate-argument structures with the syntactic patterns of the given p...

Word Sense Disambiguation Using Semi-Supervised Naive Bayes with Ontological Constraints

Background. Word sense disambiguation (WSD) is the task of mapping an ambiguous word to its correct sense given its context. As high-quality sense-tagged data is scarce and expensive to obtain, attention has shifted from fully supervised to semi-supervised and knowledge-based approaches to WSD that rely on a lexical knowledge base such as WordNet instead of large amounts of hand-labeled data. Wha...

A Supervised Classification Approach for Measuring Relational Similarity between Word Pairs

Measuring the relational similarity between word pairs is important in numerous natural language processing tasks such as solving word analogy questions, classifying noun-modifier relations and disambiguating word senses. We propose a supervised classification method to measure the similarity between semantic relations that exist between words in two word pairs. First, each pair of words is repr...

Publication date: 2014